# High-fidelity audio

Voxpolska V1 Merged 16bit

VoxPolska is an advanced model focused on Polish text-to-speech conversion, capable of generating natural, fluent, and expressive Polish speech.

Speech Synthesis

Transformers Other

Inspiremusic Base

InspireMusic is a unified toolkit focused on music generation, song generation, and audio generation, featuring high audio quality and long-form music generation capabilities.

Audio Generation

Safetensors English

An audio denoising and voice enhancement model built on PyTorch that removes background noise and improves voice clarity.

Audio Enhancement

Musicgen Stereo Melody Large

MusicGen is a text-to-music generation model that supports stereo and melody guidance, capable of producing high-quality music samples based on text descriptions or audio prompts.

Audio Generation

Bark is a Transformer-based text-to-audio model created by Suno, capable of generating highly realistic multilingual speech, music, background noise, and simple sound effects.

Speech Synthesis

Transformers Supports Multiple Languages

Tts Transformer Zh Cv7 Css10

A Transformer-based text-to-speech model built on fairseq S^2, supporting Simplified Chinese with a single female voice, trained on Common Voice v7 and CSS10 datasets.

Speech Synthesis Chinese


© 2025 AIbase